Maria Musial 156062
Computer vision - Lab 7¶
Agenda¶
Image segmentation based on:
- thresholding,
- cluster analysis,
- detecting image features (e.g. edges),
- region growing.
Helpers¶
%matplotlib inline
import glob
import cv2
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import PIL
import plotly.graph_objects as go
import plotly.io as pio
from IPython.display import HTML, display
from matplotlib.colors import ListedColormap
from pandas import DataFrame
from sklearn.mixture import GaussianMixture
pd.options.display.html.border = 0
pd.options.display.float_format = '{:,.2f}'.format
Images¶
- Lenna (available at the link) - one of the most popular images historically used for testing image processing and compression,
- clevr - comes from the CLEVR dataset, which deals with the Visual Question Answering problem,
- graf - sample graffiti image from the OpenCV repository,
- sudoku - sample sudoku image from the OpenCV repository,
- skittles - several images containing Skittles.
# # download images
# !wget -O lena_std.tif http://www.lenna.org/lena_std.tif
# !wget -O clevr.jpg https://cs.stanford.edu/people/jcjohns/clevr/teaser.jpg
# !wget -O graf.png https://github.com/opencv/opencv/raw/master/samples/data/graf1.png
# !wget -O sudoku.png https://raw.githubusercontent.com/opencv/opencv/master/samples/data/sudoku.png
# for i in range(100, 111):
# !wget -O skittles{i}.jpg https://github.com/possibly-wrong/skittles/blob/master/images/{i}.jpg?raw=true
Visualization¶
def imshow(a):
a = a.clip(0, 255).astype("uint8")
if a.ndim == 3:
if a.shape[2] == 4:
a = cv2.cvtColor(a, cv2.COLOR_BGRA2RGBA)
else:
a = cv2.cvtColor(a, cv2.COLOR_BGR2RGB)
display(PIL.Image.fromarray(a))
css = """
<style type="text/css">
table, td, table.dataframe, table.dataframe td {
border: 1px solid black; /* border: double; */
border-collapse: collapse;
border-style: solid;
border-spacing: 0px;
background-color: rgb(250,250,250);
width: 18px;
height: 18px;
text-align: center;
transform: scale(1.0);
margin: 2px;
}
</style>
"""
def h(s):
return display(HTML(css + DataFrame(s).to_html(header=False, index=False)))
def h_color(a, cmap="gray", scale=2):
s = [a.shape[0] * scale, a.shape[1] * scale]
plt.figure(figsize=s)
plt.tick_params(
axis="both",
which="both",
bottom=False,
top=False,
labelbottom=False,
labelleft=False,
left=False,
right=False,
)
plt.imshow(a, cmap=cmap)
cmap = ListedColormap(
[
"black",
"tomato",
"chocolate",
"darkorange",
"gold",
"olive",
"green",
"deepskyblue",
"blueviolet",
"hotpink",
]
)
def h_grid(grid, scale=1):
h_color(grid, cmap, scale)
def pix_show(pixels, skip_each=1, height=400, width=400, colors=None):
pixels = pixels[::skip_each]
if colors is None:
colors = pixels[:, ::-1]
else:
colors = colors[::skip_each]
b, g, r = pixels[:, 0], pixels[:, 1], pixels[:, 2]
fig = go.Figure(
data=[
go.Scatter3d(
x=b,
y=g,
z=r,
mode="markers",
marker={"size": 2, "color": colors, "opacity": 0.7},
)
],
layout_xaxis_range=[0, 1],
layout_yaxis_range=[0, 1],
)
scene = {
"xaxis": dict(title="Blue"),
"yaxis": dict(title="Green"),
"zaxis": dict(title="Red"),
}
fig.update_layout(
autosize=False, height=height, width=width, scene=scene, showlegend=True
)
pio.show(fig)
Image segmentation¶
Image segmentation is a crucial technique in computer vision that involves partitioning an image into distinct regions or segments, each representing a meaningful part of the image. By classifying pixels into categories, segmentation helps in understanding the structure and content of an image, making it essential for applications such as object detection, medical imaging, autonomous driving, and image editing.
There are two primary types of image segmentation:
- Semantic segmentation - assigns a label to every pixel in an image based on its category, ensuring that all pixels belonging to the same object type (e.g., "tree" or "car") are grouped together.
- Instance segmentation - not only assigns category labels to pixels but also distinguishes between different instances of the same object category.
The main techniques of classical semantic segmentation:
- Thresholding methods: These techniques segment an image by converting it into binary form using one or more thresholds applied in a specific color space. This separates regions of interest from the background based on intensity or color values.
- Edge detection-based segmentation: By detecting changes in intensity, these methods identify object boundaries, using edge-detection algorithms like Sobel or Canny to segment images.
- Region-based segmentation: This approach groups neighboring pixels with similar properties, such as color or intensity, to form cohesive regions, often starting with a seed point and expanding outward.
- Graph-based segmentation: Images are represented as graphs, where pixels or regions are nodes, and edges define the relationships between them, typically based on similarity in color, intensity, or spatial proximity. The goal is to partition the graph into subsets or segments such that nodes within a segment are highly similar, while nodes across different segments are significantly dissimilar.
- Segmentation by clustering: This method involves grouping pixels into clusters based on their feature similarity (e.g., color, intensity, or texture), using clustering algorithms. Typically, the arrangement of pixels and their neighborhood relationships are not directly considered, focusing instead on their feature-based attributes.
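Graph-based segmentation is the only technique in this list that is not demonstrated later in the lab, so here is a minimal sketch of the idea: pixels are graph nodes, 4-neighbours are joined by edges weighted by intensity difference, and a union-find forest merges nodes across cheap edges. The fixed merge threshold `tau` and the function name are illustrative only; real methods such as Felzenszwalb-Huttenlocher adapt the merge criterion per region.

```python
import numpy as np

def graph_segment(img, tau=10):
    """Merge 4-connected pixels whose intensity difference is at most tau.

    A much-simplified instance of graph-based segmentation (illustrative,
    not the Felzenszwalb-Huttenlocher algorithm itself).
    """
    h, w = img.shape
    parent = np.arange(h * w)  # union-find forest over pixel nodes

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    vals = img.astype(np.int32)
    for y in range(h):
        for x in range(w):
            i = y * w + x
            # Right and bottom edges cover every pair of 4-neighbours once.
            if x + 1 < w and abs(vals[y, x] - vals[y, x + 1]) <= tau:
                parent[find(i)] = find(i + 1)
            if y + 1 < h and abs(vals[y, x] - vals[y + 1, x]) <= tau:
                parent[find(i)] = find(i + w)

    # Relabel the roots to consecutive segment ids.
    roots = np.array([find(i) for i in range(h * w)])
    return np.unique(roots, return_inverse=True)[1].reshape(h, w)
```

Nodes within a segment end up connected by a chain of low-cost edges, while segments separated by high intensity differences stay apart.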
Segmentation through thresholding¶
The easiest way to segment an image is to threshold pixel intensities. The thresholding operation replaces all intensities above a certain threshold with one constant value and all intensities below it with another.
There is also segmentation with multiple thresholds, which was presented in the first class.
lena = cv2.imread("./lena_std.tif", cv2.IMREAD_COLOR)
imshow(cv2.resize(lena, None, fx=0.4, fy=0.4))
lena_gray = cv2.cvtColor(lena, cv2.COLOR_BGR2GRAY)
imshow(cv2.resize(lena_gray, None, fx=0.4, fy=0.4))
lut = np.array([255] * 100 + [0] * 100 + [255] * 56, dtype=np.uint8)
lena_lut = cv2.LUT(lena_gray, lut)
imshow(lena_lut)
# imshow(cv2.resize(lena_lut, None, fx=0.4, fy=0.4))
The OpenCV library also includes a ready-made implementation of other simple image thresholding approaches. To perform the thresholding operation in OpenCV, the threshold() function should be called, which takes the image, the threshold value, the maximum value and the threshold method that should be used.
The available thresholding methods include:
- binary - pixels with intensities above the threshold are set to the maximum value, the rest to 0,
- binary inverted - pixels with intensities above the threshold are set to 0, the rest to the maximum value,
- truncate - pixels with intensities above the threshold are set to the threshold value; the rest remain unchanged,
- to zero - pixels with intensities below the threshold are set to 0; the rest remain unchanged,
- to zero inverted - pixels with intensities above the threshold are set to 0; the rest remain unchanged.
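The listed variants can be sketched in pure NumPy (a sketch of the semantics only; the `cv2.threshold()` calls below are what is actually used, and OpenCV compares with a strict `src > thresh`):

```python
import numpy as np

def threshold_binary(img, t, max_value=255):
    # THRESH_BINARY: max_value above the threshold, 0 otherwise
    return np.where(img > t, max_value, 0).astype(np.uint8)

def threshold_binary_inv(img, t, max_value=255):
    # THRESH_BINARY_INV: 0 above the threshold, max_value otherwise
    return np.where(img > t, 0, max_value).astype(np.uint8)

def threshold_trunc(img, t):
    # THRESH_TRUNC: clip values above the threshold down to it
    return np.minimum(img, t).astype(np.uint8)

def threshold_tozero(img, t):
    # THRESH_TOZERO: keep values above the threshold, zero the rest
    return np.where(img > t, img, 0).astype(np.uint8)

def threshold_tozero_inv(img, t):
    # THRESH_TOZERO_INV: zero values above the threshold, keep the rest
    return np.where(img > t, 0, img).astype(np.uint8)
```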
_, lena_bin = cv2.threshold(lena_gray, 127, 255, cv2.THRESH_BINARY)
_, lena_bin_inv = cv2.threshold(lena_gray, 127, 255, cv2.THRESH_BINARY_INV)
_, lena_trunc = cv2.threshold(lena_gray, 127, 255, cv2.THRESH_TRUNC)
_, lena_tozero = cv2.threshold(lena_gray, 127, 255, cv2.THRESH_TOZERO)
_, lena_tozero_inv = cv2.threshold(lena_gray, 127, 255, cv2.THRESH_TOZERO_INV)
imshow(cv2.resize(np.concatenate([lena_gray, lena_bin, lena_bin_inv], 1), None, fx=0.4, fy=0.4))
imshow(cv2.resize(np.concatenate([lena_trunc, lena_tozero, lena_tozero_inv], 1), None, fx=0.4, fy=0.4))
Sudoku example¶
sudoku = cv2.imread("./sudoku.png", cv2.IMREAD_GRAYSCALE)
imshow(cv2.resize(sudoku, None, fx=0.4, fy=0.4))
print(sudoku.shape)
(563, 558)
_, sudoku_bin = cv2.threshold(sudoku, 70, 255, cv2.THRESH_BINARY)
imshow(cv2.resize(sudoku_bin, None, fx=0.4, fy=0.4))
lut = np.array([0] * 50 + [255] * 80 + [0] * 126, dtype=np.uint8)
sudoku_lut = cv2.LUT(sudoku, lut)
imshow(sudoku_lut)
# imshow(cv2.resize(sudoku_lut, None, fx=0.4, fy=0.4))
OTSU¶
Otsu's method is an algorithm that dynamically determines the threshold value so that the weighted intra-class variance of the two resulting classes is minimized, which is equivalent to maximizing the inter-class variance.
The objective function minimized by Otsu's method is expressed as: $$\sigma^2_w(t) = Q_1\sigma^2_1 + Q_2\sigma^2_2$$ where:
- $Q_i$ - represents the probability of a pixel belonging to the i-th class. This probability is derived from the cumulative distribution function: $$Q_i = P_i(f(x,y) < t_i)$$
- $\sigma^2_i$ - denotes the variance within the i-th class.
To identify the optimal threshold, this expression is minimized across all possible threshold values (from 0 to 255).
The initial step is to calculate the probability density function (PDF), i.e. the probability of a pixel having a specific intensity (from 0 to 255). This can be achieved by computing the histogram and normalizing it. Additionally, the cumulative distribution function (CDF) will be computed to facilitate the calculation of the conditional means and variances.
def calculate_pdf_cdf(img):
h = cv2.calcHist([img], [0], None, [256], [0, 256])
pdf = h.ravel() / h.sum()
cdf = np.cumsum(pdf)
return pdf, cdf
pdf, cdf = calculate_pdf_cdf(sudoku)
plt.plot(pdf)
plt.show()
plt.plot(cdf)
plt.show()
In a single iteration of the Otsu algorithm, a specific threshold splits the probability distribution into the probabilities of pixel occurrence in the two classes. The mean of each class is computed with the formula for the conditional expected value, because we average pixel intensities under the condition that the pixel belongs to a given class (the "expected value for a given class"):
$$ E(C_1 | t_1) = \sum_{x =1}^{t_1} \frac{xP(x)}{Q_1}$$
Then, to calculate the (conditional!) variances, we use the previously calculated conditional expected values. $$\sigma^2_1 = \sum_{x =1}^{t_1} \frac{(x - E[C_1|t_1])^2P(x)}{Q_1}$$
def cond_mean(x, p, q):
# Calculate the conditional mean for a given class
return np.sum(x * p / q)
def otsu_one_step(x, pdf, cdf, epsilon=1e-6):
"""
Calculates one step of the Otsu algorithm for one threshold value x.
"""
# Split intensity values into lower and higher than the threshold
x_low, x_high = np.arange(x), np.arange(x, 256)
# Split probability densities into lower and higher than the threshold
p_low, p_high = pdf[:x], pdf[x:]
# Calculate cumulative probabilities for lower and higher classes
q_low, q_high = cdf[x], cdf[-1] - cdf[x]
# Check for small cumulative probabilities to avoid division by zero
if q_low < epsilon or q_high < epsilon:
return None
# Calculate conditional means for lower and higher classes
m_low, m_high = cond_mean(x_low, p_low, q_low), cond_mean(x_high, p_high, q_high)
# Calculate conditional variances for lower and higher classes
s_low = cond_mean((x_low - m_low) ** 2, p_low, q_low)
s_high = cond_mean((x_high - m_high) ** 2, p_high, q_high)
# Combine conditional variances to optimize Otsu's method
return s_low * q_low + s_high * q_high
In its most basic version, the Otsu algorithm iterates over all possible divisions (thresholds from 1 to 254) and selects the one for which the previously presented objective function returns the smallest value.
def otsu(img):
pdf, cdf = calculate_pdf_cdf(img)
v_min = None
threshold = 0
for i in range(1, 254):
v = otsu_one_step(i, pdf, cdf)
if v is not None and (v_min is None or v < v_min):
v_min = v
threshold = i + 1
_, img_otsu = cv2.threshold(img, threshold, 255, cv2.THRESH_BINARY)
return threshold, img_otsu
In order to use the Otsu method in OpenCV, just add THRESH_OTSU to the thresholding operation.
th_self, sudoku_otsu_self = otsu(sudoku)
th_auto, sudoku_otsu_auto = cv2.threshold(
sudoku, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU
)
print(
"Found value of thresholding with the OTSU algorithm (custom implementation):",
th_self,
)
print("Found value of thresholding with the OTSU algorithm (OpenCV):", th_auto)
print("\nOTSU (custom implementation):")
imshow(cv2.resize(sudoku_otsu_self, None, fx=0.4, fy=0.4))
print("\nOTSU (OpenCV):")
imshow(cv2.resize(sudoku_otsu_auto, None, fx=0.4, fy=0.4))
Found value of thresholding with the OTSU algorithm (custom implementation): 96
Found value of thresholding with the OTSU algorithm (OpenCV): 96.0
OTSU (custom implementation):
OTSU (OpenCV):
Adaptive methods¶
Among the thresholding methods there are also adaptive ones - methods that adjust the threshold value depending on the image content.
Adaptive thresholding methods often work very well when the input image is divided into smaller areas and the threshold value is adjusted separately for each area. The motivation behind such a mechanism is the fact that in real images the lighting (as well as focus, balance, etc.) is uneven.
- ADAPTIVE_THRESH_MEAN_C - The threshold value corresponds to the mean of the neighborhood area of size blockSize, minus a constant.
- ADAPTIVE_THRESH_GAUSSIAN_C - The threshold value corresponds to the Gaussian-weighted sum of the neighborhood area of size blockSize, minus a constant.
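As a sketch of what ADAPTIVE_THRESH_MEAN_C computes, the per-pixel threshold can be written out explicitly: the mean of the blockSize x blockSize neighbourhood (borders replicated) minus the constant. This naive NumPy version is only illustrative; OpenCV's internal rounding and border handling may differ slightly at some pixels.

```python
import numpy as np

def adaptive_mean_threshold(img, block_size=3, c=2, max_value=255):
    # For each pixel the threshold is the local neighbourhood mean minus c;
    # the pixel becomes max_value if it exceeds that local threshold, else 0.
    r = block_size // 2
    padded = np.pad(img.astype(np.float64), r, mode="edge")  # replicate borders
    out = np.zeros(img.shape, np.uint8)
    for y in range(img.shape[0]):
        for x in range(img.shape[1]):
            t = padded[y:y + block_size, x:x + block_size].mean() - c
            out[y, x] = max_value if img[y, x] > t else 0
    return out
```

A single bright pixel on a flat background survives thresholding while its flat surroundings do not, which is exactly why adaptive methods cope with uneven lighting.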
blockSize = 101
constant = 2
lena_ad_mean = cv2.adaptiveThreshold(
lena_gray, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY, blockSize, constant
)
lena_ad_gauss = cv2.adaptiveThreshold(
lena_gray,
255,
cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
cv2.THRESH_BINARY,
blockSize,
constant,
)
imshow(cv2.resize(np.concatenate([lena_ad_mean, lena_ad_gauss], 1), None, fx=0.4, fy=0.4))
blockSize = 3
constant = 2
sudoku_ad_mean = cv2.adaptiveThreshold(
sudoku, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY, blockSize, constant
)
sudoku_ad_gauss = cv2.adaptiveThreshold(
sudoku,
255,
cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
cv2.THRESH_BINARY,
blockSize,
constant,
)
imshow(cv2.resize(np.concatenate([sudoku_ad_mean, sudoku_ad_gauss], 1), None, fx=0.4, fy=0.4))
Segmentation of multi-channel images¶
Segmenting multi-channel images, such as RGB, using straightforward thresholding methods becomes challenging due to the necessity of defining thresholds in N-dimensional space. As a result, cluster analysis methods are frequently employed instead of simplistic thresholding for the segmentation of multi-channel images.
Cluster analysis involves identifying clusters of pixels within a specific space, sometimes directly in the intensity space. This process entails creating distinct pixel classes based on their characteristics within that space.
To illustrate, let's conduct a basic pixel intensity analysis on the given image below.
graf = cv2.imread("./graf.png", cv2.IMREAD_COLOR)
graf = cv2.resize(graf, None, fx=0.75, fy=0.75)
imshow(cv2.resize(graf, None, fx=0.4, fy=0.4))
The image, presented as a list of (BGR) pixels, is displayed in 3D space, where the coordinates of each pixel are its intensity values. Additionally, the pixels are colored according to their intensities.
graf_pixels = graf.reshape([-1, 3])
pix_show(graf_pixels, 16)
From the analysis of the above visualization, a few conclusions can be drawn:
- most pixels lie on a straight line between black (0, 0, 0) and white (255, 255, 255),
- a few color clusters can be distinguished:
- red,
- blue,
- gold / orange,
- green,
- claret,
- dark purple.
One method of splitting the space into clusters is the Gaussian Mixture Model, which approximates the distribution of clusters using N Gaussian components. It is a trainable method, so it requires a sample of data to which the mathematical model can be fitted.
We will use the list of (BGR) pixels as the data to which we fit the model, and then assign each pixel the index of the Gaussian component to which it belongs with the highest probability.
# model initialization and training
model = GaussianMixture(n_components=8)
model.fit(graf_pixels)
# assigning classes to pixels
segments = model.predict(graf_pixels)
print(segments)
[1 1 1 ... 2 2 2]
The next step will be to calculate the average color for each class (segment) and display the pixels again with the colors representing the segmentation.
segments_colors = np.stack([graf_pixels[segments == i].mean(0) for i in range(8)], 0)
colors = np.take(segments_colors, segments, 0)
pix_show(graf_pixels, 16, colors=colors[:, ::-1])
Pixels with mapped classes are the final image segmentation based on cluster analysis. The input image and the segmentation effect are shown below.
segmented = colors.reshape(graf.shape)
imshow(cv2.resize(np.concatenate([graf, segmented], 1), None, fx=0.4, fy=0.4))
Segmentation by edge detection¶
Segmentation through edge detection builds on the previous class on detecting key points, corners and edges.
The idea is to divide the image along its edges and then fill the enclosed areas, assigning consecutive identifiers to consecutive separable regions.
The assignment of region identifiers is presented in the next section.
clevr = cv2.imread("./clevr.jpg", cv2.IMREAD_COLOR)
clevr = cv2.resize(clevr, None, fx=0.5, fy=0.5)
imshow(cv2.resize(clevr, None, fx=0.4, fy=0.4))
clevr_gray = cv2.cvtColor(clevr, cv2.COLOR_BGR2GRAY)
imshow(cv2.resize(clevr_gray, None, fx=0.4, fy=0.4))
canny_high, _ = cv2.threshold(clevr_gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
canny_low = 0.5 * canny_high
clevr_canny = cv2.Canny(clevr_gray, canny_low, canny_high, apertureSize=3) # Sobel aperture size (must be odd, 3-7)
clevr_canny = cv2.morphologyEx(
clevr_canny, cv2.MORPH_CLOSE, kernel=np.ones((3, 3), np.uint8)
)
imshow(cv2.resize(clevr_canny, None, fx=0.4, fy=0.4))
Segmentation by region growing¶
Segmentation by region growing consists in iteratively merging adjacent areas until a certain condition is met. Areas are merged when they pass a uniformity test, and the algorithm runs until a stop condition is met.
Uniformity test - merge two areas and check a condition. An example condition is the average difference of pixel intensities in the two areas: if it is greater than a certain threshold, the areas are not uniform and are not merged.
Stop condition - this can mean that no further merges of areas are possible, or it can be an early-stopping criterion (e.g. when we do not want the areas to grow larger than a set limit).
For region growing, previously detected edges can serve as region boundaries, so the stop condition means that all pixels inside an area surrounded by edges belong to the region.
regions = np.zeros(clevr_canny.shape[:2], np.int32) # height, width; ignore color channel
neighbours = [(-1, 0), (1, 0), (0, -1), (0, 1)]
def find_neighbours(img, y, x):
c_neighbours = []
for dy, dx in neighbours:
ny, nx = y + dy, x + dx
if ny < 0 or ny >= img.shape[0] or nx < 0 or nx >= img.shape[1]:
continue
if regions[ny, nx] > 0:
continue
if img[ny, nx] == 255:
continue
if img[y, x] == img[ny, nx]: # Uniformity test
c_neighbours.append((ny, nx))
return c_neighbours
def grow_region(img, y, x, cls):
regions[y, x] = cls
c_neighbours = find_neighbours(img, y, x) #find initial region neighbours from starting pixel
for ny, nx in c_neighbours:
regions[ny, nx] = cls
while len(c_neighbours) > 0: #iterative searching for extension of the region
new_neighbours = []
for ny, nx in c_neighbours:
i_new_neighbours = find_neighbours(img, ny, nx)
for _ny, _nx in i_new_neighbours:
regions[_ny, _nx] = cls
new_neighbours.extend(i_new_neighbours)
c_neighbours = new_neighbours
i = 1
for y in range(clevr_canny.shape[0]):
for x in range(clevr_canny.shape[1]):
if regions[y, x] == 0 and clevr_canny[y, x] == 0: # if it's not assigned yet and the pixel is not an edge
grow_region(clevr_canny, y, x, i) #i is for colouring, IDing the region
i += 1
mean_colors = np.stack(
[
np.array([255, 255, 255]) if j == 0 else clevr[regions == j].mean(0)
for j in range(i)
],
0,
)
regions_colors = np.take(mean_colors, regions, 0)
imshow(cv2.resize(regions_colors, None, fx=0.4, fy=0.4))
In this case (detected edges), a similar result can also be obtained using connected components.
kernel = np.ones((3, 3), np.uint8)
edges = cv2.dilate(clevr_canny, kernel, iterations=1)
ret, markers = cv2.connectedComponents(255 - edges) # ret - number of connected components; markers - 2D array where each pixel is labeled with its region ID (0 for background)
mean_colors = np.stack(
[
np.array([255, 255, 255]) if j == 0 else clevr[markers == j].mean(0) # compute mean color of each component
for j in range(ret)
],
0,
)
regions_colors = np.take(mean_colors, markers, 0) # for each pixel, take the mean color corresponding to its component label
imshow(cv2.resize(regions_colors, None, fx=0.4, fy=0.4))
Tasks¶
Task 1¶
As in the section on multi-channel image segmentation, perform the same pixel intensity cluster analysis for the './skittles100.jpg' image and then segment the image using the K-Means algorithm (available, among others, in the scikit-learn library: sklearn.cluster.KMeans).
Present the intermediate results:
- BGR input image
- BGR pixels in 3D space,
- segmentation result on BGR pixels in 3D space,
- segmentation result as a 2D image (BGR)
from sklearn.cluster import KMeans
skittles = cv2.imread("skittles100.jpg", cv2.IMREAD_COLOR)
imshow(cv2.resize(skittles, None, fx=0.4, fy=0.4))
skittles_pixels = skittles.reshape([-1, 3])
pix_show(skittles_pixels, 16)
The elbow method is used to choose a number of clusters that will work well for the segmentation later.
data = skittles_pixels
def elbow_method(data, k_range):
inertia = [] # list to store the inertia values
for k in k_range: # iterate over the range of k values
kmeans = KMeans(n_clusters=k, n_init='auto') # create a KMeans instance with k clusters
kmeans.fit(data) # fit the data to the KMeans instance
inertia.append(kmeans.inertia_) # append the inertia value to the inertia list
plt.figure(figsize=(8, 4))
plt.plot(k_range, inertia, marker='o')
plt.xlabel('Number of clusters (k)')
plt.ylabel('Inertia')
plt.title('Elbow Method for Optimal k')
plt.grid(True)
plt.show()
elbow_method(data, range(1, 12)) # call the elbow_method function with the data and a range of k values
kmeans = KMeans(n_clusters=9, n_init='auto', random_state=42).fit(skittles_pixels)
segments = kmeans.predict(skittles_pixels)
segments_colors = np.stack([skittles_pixels[segments==i].mean(0) for i in range(9)], 0) #get colors of each segment
colors = np.take(segments_colors, segments, 0) #map colors to pixels
pix_show(skittles_pixels, 16, colors=colors[:, ::-1])
segmented = colors.reshape(skittles.shape)
imshow(cv2.resize(np.concatenate([skittles, segmented], 1), None, fx=0.4, fy=0.4))
Task 2¶
Using the methods you learned in the previous class, find the number of Skittles in the image './skittles100.jpg' (it is not necessary to use the solution from task 1). Show intermediate results and describe the processing steps in comments.
Show the original image with the found individual Skittles marked on it.
How it works:¶
- get_segmented_explainable: read the image, reshape it to a 1D pixel array, fit KMeans, predict clusters, unify the background, get the mean color of each cluster, map colors back to pixels, reshape to an image (I tried Gaussian blur, but it gave worse results.)
- count_skittles_explainable: binarize the image, delete the corners, apply closing and erosion to separate the skittles, grow the circles back to skittle size, count the regions, get the edges of the found skittles
- put_edges_exp: put the edges on the image and visualise the results
All these functions have a version suited for task 3, so that my computer doesn't explode when generating images and graphs.
#read image, reshape to 1D array, fit KMeans, predict clusters, unify background, get colors of each cluster, map colors to pixels, reshape to image
#Gaussian blur gave worse results.
def get_segmented_explainable(path):
skittles = cv2.imread(path, cv2.IMREAD_COLOR)
imshow(cv2.resize(skittles, None, fx=0.4, fy=0.4))
skittles_pixels = skittles.reshape([-1, 3])
print("Pixels in RGB space:")
pix_show(skittles_pixels, 16)
kmeans = KMeans(n_clusters=9, n_init='auto', random_state=42).fit(skittles_pixels)
segments = kmeans.predict(skittles_pixels)
segments_colors = np.stack([skittles_pixels[segments==i].mean(0) for i in range(9)], 0) #get colors of each segment
colors = np.take(segments_colors, segments, 0) #map colors to pixels
print("Pixels clustered:")
pix_show(skittles_pixels, 16, colors=colors[:, ::-1])
segmented = colors.reshape(skittles.shape)
print("Segmented image:")
imshow(cv2.resize(np.concatenate([skittles, segmented], 1), None, fx=0.4, fy=0.4))
cols=[]
for color in segments_colors:
cols.append(np.full((100,100,3),color))
imshow(cv2.resize(np.concatenate(cols, 1), None, fx=0.4, fy=0.4))
# I want to unify the background: get all grey segments and reassign their values to one of the greys
grey_segments = []
grey_color = np.array([128, 128, 128])
for i, color in enumerate(segments_colors):
if np.allclose(color, grey_color, atol=100):
grey_segments.append(i)
print("Background clusters:", grey_segments)
segments[np.isin(segments, grey_segments)] = grey_segments[0] # get background to be one
segments_colors = []
for i in range(9):
segment_pixels = skittles_pixels[segments == i]
if segment_pixels.size > 0:
segment_mean = segment_pixels.mean(0)
segments_colors.append(segment_mean)
else:
segments_colors.append(np.array([128, 128, 128], dtype=np.uint8)) # Default grey color for empty segments
segments_colors = np.array(segments_colors, dtype=np.uint8)
colors = np.take(segments_colors, segments, 0) #map colors to pixels
segmented = colors.reshape(skittles.shape)
print("Segmented image with unified background:")
imshow(cv2.resize(np.concatenate([skittles, segmented], 1), None, fx=0.4, fy=0.4))
return segmented
#binarizing image, deleting corners, closing in opening to separate skittles, growing circles back to skittles size, counting regions, getting edges of found skittles, visualising results
def count_skittles_explainable(image):
#WHAT ARE WE WORKING ON
bin_skit = (image[:, :, 0] > 100).astype(np.uint8) * 255
bin_skit = cv2.bitwise_not(bin_skit)
print("The image we're counting skittles on:")
imshow(cv2.resize(image, None, fx=0.4, fy=0.4))
print("The binarized version:")
imshow(cv2.resize(bin_skit, None, fx=0.4, fy=0.4))
#Deleting corners
height, width = bin_skit.shape
corner_size = 50 # Size of the corner regions to mask
mask = np.full(bin_skit.shape, 255, dtype=np.uint8)
corner_positions = [
(0, 0),
(width - corner_size, 0),
(0, height - corner_size),
(width - corner_size, height - corner_size)
]
for x, y in corner_positions:
cv2.rectangle(mask, (x, y), (x + corner_size, y + corner_size), 0, -1)
bin_skit[mask == 0] = 0
print("Deleted corners, to focus on skittles:")
imshow(cv2.resize(bin_skit, None, fx=0.4, fy=0.4))
#CLOSING IN OPENING TO SEPARATE SKITTLES
struct = np.ones((3, 3), np.uint8)
clos = cv2.dilate(bin_skit, struct, iterations=4)
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (3, 3))
img_bin_dil = cv2.erode(clos, struct, iterations=13)
img_bin_dil_ker = cv2.erode(img_bin_dil, kernel, iterations=16) #right number of skittles!!!!
print("Closing to close 's' on skittles, erosion to separate them to be able to count them")
imshow(cv2.resize(np.concatenate([img_bin_dil, img_bin_dil_ker],1), None, fx=0.4, fy=0.4))
num_labels, labels = cv2.connectedComponents(img_bin_dil_ker)
print(f"Number of regions (skittles): {num_labels - 1}")
#Grow circles back to skittles size
kernel = np.ones((30,30), np.uint8)
grown_regions = np.zeros_like(img_bin_dil_ker, dtype=np.uint8)
for label in range(1, num_labels):
label_mask = np.uint8(labels == label) * 255
dilated_mask = cv2.dilate(label_mask, kernel, iterations=1)
grown_regions = cv2.bitwise_or(grown_regions, dilated_mask)
edges = cv2.Canny(grown_regions, threshold1=100, threshold2=200)
print("Growing regions back to skittles size, edges:")
imshow(cv2.resize(np.concatenate([img_bin_dil_ker, grown_regions, edges], 1), None, fx=0.4, fy=0.4))
return num_labels - 1, edges
def put_edges_exp(path):
skittles, edges = count_skittles_explainable(get_segmented_explainable(path))
image = cv2.imread(path, cv2.IMREAD_COLOR)
edges_rgb = cv2.cvtColor(edges, cv2.COLOR_GRAY2BGR)
imshow(cv2.resize(np.concatenate([image, edges_rgb], 1), None, fx=0.4, fy=0.4))
edges_mask = edges_rgb == 255
image[edges_mask] = 255
return image
#VISUALISING THE RESULTS FOR TASK3, just image->results
def count_skittles(image):
#WHAT ARE WE WORKING ON
bin_skit = (image[:, :, 0] > 100).astype(np.uint8) * 255
bin_skit = cv2.bitwise_not(bin_skit)
#Deleting corners
height, width = bin_skit.shape
corner_size = 50 # Size of the corner regions to mask
mask = np.full(bin_skit.shape, 255, dtype=np.uint8)
corner_positions = [
(0, 0),
(width - corner_size, 0),
(0, height - corner_size),
(width - corner_size, height - corner_size)
]
for x, y in corner_positions:
cv2.rectangle(mask, (x, y), (x + corner_size, y + corner_size), 0, -1)
bin_skit[mask == 0] = 0
#CLOSING IN OPENING TO SEPARATE SKITTLES
struct = np.ones((3, 3), np.uint8)
clos = cv2.dilate(bin_skit, struct, iterations=4)
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (3, 3))
img_bin_dil = cv2.erode(clos, struct, iterations=13)
img_bin_dil_ker = cv2.erode(img_bin_dil, kernel, iterations=16) #right number of skittles!!!!
num_labels, labels = cv2.connectedComponents(img_bin_dil_ker)
#Grow circles back to skittles size
kernel = np.ones((30,30), np.uint8)
grown_regions = np.zeros_like(img_bin_dil_ker, dtype=np.uint8)
for label in range(1, num_labels):
label_mask = np.uint8(labels == label) * 255
dilated_mask = cv2.dilate(label_mask, kernel, iterations=1)
grown_regions = cv2.bitwise_or(grown_regions, dilated_mask)
edges = cv2.Canny(grown_regions, threshold1=100, threshold2=200)
return num_labels - 1, edges
def get_segmented(path):
skittles = cv2.imread(path, cv2.IMREAD_COLOR)
skittles_pixels = skittles.reshape([-1, 3])
kmeans = KMeans(n_clusters=9, n_init='auto', random_state=42).fit(skittles_pixels)
segments = kmeans.predict(skittles_pixels)
segments_colors = np.stack([skittles_pixels[segments==i].mean(0) for i in range(9)], 0) #get colors of each segment
colors = np.take(segments_colors, segments, 0) #map colors to pixels
segmented = colors.reshape(skittles.shape)
cols=[]
for color in segments_colors:
cols.append(np.full((100,100,3),color))
#I want to unify background
grey_segments = []
grey_color = np.array([128, 128, 128])
for i, color in enumerate(segments_colors):
if np.allclose(color, grey_color, atol=100):
grey_segments.append(i)
segments[np.isin(segments, grey_segments)] = grey_segments[0] # get background to be one
segments_colors = []
for i in range(9):
segment_pixels = skittles_pixels[segments == i]
if segment_pixels.size > 0:
segment_mean = segment_pixels.mean(0)
segments_colors.append(segment_mean)
else:
segments_colors.append(np.array([128, 128, 128], dtype=np.uint8)) # Default grey color for empty segments
segments_colors = np.array(segments_colors, dtype=np.uint8)
colors = np.take(segments_colors, segments, 0) #map colors to pixels
segmented = colors.reshape(skittles.shape)
return segmented
def put_edges_task3(path):
skittles, edges = count_skittles(get_segmented(path))
image = cv2.imread(path, cv2.IMREAD_COLOR)
image_t = image.copy()
edges_rgb = cv2.cvtColor(edges, cv2.COLOR_GRAY2BGR)
edges_mask = edges_rgb == 255
image_t[edges_mask] = 255
imshow(cv2.resize(np.concatenate([image, image_t],1), None, fx=0.4, fy=0.4))
imshow(cv2.resize(put_edges_exp("skittles100.jpg"), None, fx=0.4, fy=0.4))
Pixels in RGB space:
Pixels clustered:
Segmented image:
Background clusters: [0, 2, 7, 8]
Segmented image with unified background:
The image we're counting skittles on:
The binarized version:
Deleted corners, to focus on skittles:
Closing to close 's' on skittles, erosion to separate them to be able to count them
Number of regions (skittles): 60
Growing regions back to skittles size, edges:
Task 3¶
- Test the solution from task 2 on the remaining skittles images.
- Improve the solution so that it works properly for these images.
for file in glob.glob("./skittles*"):
print(file)
skittles = cv2.imread(file, cv2.IMREAD_COLOR)
# imshow(cv2.resize(skittles, None, fx=0.4, fy=0.4))
put_edges_task3(file)
# I improved the erosion and closing so that it gets more skittles right.
# The problem is with the cluster of yellows in 109, and with images where the flash + "s" is quite big on the skittles.
# The noise in 103 (a clump of skittles?) is dealt with.
# Overall the results are within +/- 4 skittles.
.\skittles100.jpg
.\skittles101.jpg
.\skittles102.jpg
.\skittles103.jpg
.\skittles104.jpg
.\skittles105.jpg
.\skittles106.jpg
.\skittles107.jpg
.\skittles108.jpg
.\skittles109.jpg
.\skittles110.jpg